Abstract

We consider a threshold epidemic model on a clustered random graph with overlapping communities.
More precisely, our epidemic model is such that an individual becomes infected as soon as the
proportion of her infected neighbors exceeds the threshold q of the epidemic. In our random graph
model, each individual can belong to several communities. The distributions for the community sizes 
and the number of communities an individual belongs to are arbitrary. 

We consider the case where the epidemic starts from a single individual, and we prove a phase 
transition (when the parameter q of the model varies) for the appearance of a cascade, i.e. when the 
epidemic can be propagated to an infinite part of the population. More precisely, we show that our 
epidemic is entirely described by a multi-type (and alternating) branching process, and then we apply 
Sevastyanov's theorem about the phase transition of multi-type Galton-Watson branching processes.
In addition, we compute the entries of the matrix whose largest eigenvalue gives the phase transition. 

MSC classes: 60C05, 05C80, 91D30 
1 Introduction

The spread of diseases or e-mail viruses is well modeled by classical (SI, SIR or SIS) epidemics, whose
study on complex networks has attracted a lot of attention in recent years (see Newman [21] for a review).
In such epidemics, each node can be independently influenced by each of her neighbors. For the diffusion
of an innovation, an individual's adoption behavior is highly correlated with the behavior of her neighbors
[21], and threshold epidemic models are more appropriate to model such diffusions. We consider the
game-theoretic contagion model, proposed by Blume [1] and Morris [18], and described below.

Consider a graph G in which the nodes are the individuals in the population and there is an edge (i, j)
if i and j can interact with each other. Each node has a choice between two possible actions labeled A
and B. On each edge (i, j), there is an incentive for i and j to have their actions match, which is modeled
as the following coordination game parametrized by a real number $q \in (0, 1)$: if i and j choose A (resp.
B), they each receive a payoff of $q$ (resp. $1 - q$); if they choose opposite actions, then they receive a
payoff of 0. Then the total payoff of a player is the sum of the payoffs with each of her neighbors. If the
degree of node i is $d_i$ and $S_i^B$ is the number of her neighbors playing B, then the payoff to i from choosing
A is $q(d_i - S_i^B)$ while the payoff from choosing B is $(1 - q) S_i^B$. Hence, in a best-response dynamic, i
should adopt B if $S_i^B > q d_i$ and A if $S_i^B \le q d_i$. A number of qualitative insights can be derived from
such a model even at this level of simplicity [15]. Specifically, consider a network where all nodes initially
play A. If a small number of nodes are forced to adopt strategy B (the seed) and we apply best-response
updates to other nodes in the network, then these nodes will be repeatedly applying the following rule:
switch to B if enough of your neighbors have already adopted B. There can be a cascading sequence
of nodes switching to B such that a network-wide equilibrium is reached in the limit. In this paper, we
consider the case where a node that plays B is forced to play B forever (thus the number of players B
is non-decreasing) and where the seed consists of only one vertex. The graph G is infinite, and we are
interested in the cascade phenomenon, i.e. when an infinite part of the population has adopted B at the end
of the propagation. We will show a phase transition for this phenomenon, depending on the value of the
parameter q of the model.
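
The progressive best-response rule described above can be sketched on a small finite graph. This is only an illustration under stated assumptions: the function name `progressive_contagion`, the adjacency-dictionary representation and the toy path graph are hypothetical and not part of the paper, and the paper itself works on an infinite random graph.

```python
from fractions import Fraction


def progressive_contagion(adj, seed, q):
    """Run the progressive threshold dynamics until no node switches.

    adj  : dict mapping each node to the set of its neighbours (hypothetical format)
    seed : iterable of nodes forced to play B (active)
    q    : threshold; an inactive node switches when its proportion of
           active neighbours is strictly greater than q
    """
    active = set(seed)
    while True:
        # Synchronous step: every inactive node looks at the current active set.
        newly = {
            v for v, nbrs in adj.items()
            if v not in active and nbrs
            and Fraction(len(nbrs & active), len(nbrs)) > q
        }
        if not newly:
            return active
        active |= newly  # active nodes stay active forever


# Toy example: a path 0 - 1 - 2 - 3 with seed {0}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
spread_low = progressive_contagion(path, {0}, Fraction(1, 3))   # q < 1/2
spread_high = progressive_contagion(path, {0}, Fraction(1, 2))  # q = 1/2
```

Using `Fraction` for q keeps the strict comparison exact; with floating-point thresholds the boundary case "proportion equal to q" could be misclassified.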

When the graph G is deterministic, such phase transitions were proved by Morris [18]. The graph G
that we consider here will have most of the properties observed in real-world networks. One of the most
striking features shared by real-world networks is the scale-free property [2]: their degree distribution
follows a power law. Random graphs with an arbitrary degree distribution [3] cover this property. The
contagion model on such graphs was studied by heuristic means by Watts [26], and a generalization of it
(on the same kind of random graphs) was studied rigorously by Lelarge [16]. Another feature of real-world
networks is that they all have a high clustering coefficient (Watts and Strogatz [25], see also Newman
[20] for several examples). The clustering coefficient of a graph is by definition the probability that two
given nodes are connected, knowing that they have a common neighbor. Since the asymptotic clustering
coefficient of random graphs with an arbitrary degree distribution is zero (locally, they look like trees), this
random graph model fails to cover the clustering property of real-world networks. Recently, the contagion
on clustered random graph models was studied by heuristic means by Gleeson [11], and rigorously in [7]
and [3] (in which a generalization of the contagion model is considered). The random graph models
considered have a tunable clustering coefficient and an arbitrary degree distribution, which in particular
allows the study of the clustering impact on the contagion model. However, these random graph models
do not cover the following property: in real-world networks, a node often belongs to several communities.
A community is a set of nodes which are densely connected internally and only sparsely connected with
other nodes of the network; in the clustered random graph models mentioned above, communities are
represented by cliques, and a node can belong to at most one clique. On the contrary, and as
explained in [22] and [12], the structure of many real-world networks is close to the one-mode projection
of a bipartite graph, in which each node belongs (possibly) to several cliques (communities).

A classical example of the one-mode projection of a bipartite graph is the collaboration graph of
movie actors. Let $\Gamma \subset V \times E$ be a bipartite graph, i.e. a graph with two types of nodes: V-nodes and
E-nodes, and in which there are no edges between nodes of the same type. Each E-node of $\Gamma$ represents a
'community' (a movie), and the V-nodes linked to a common E-node are members of the same community
(actors of the same movie). The one-mode projection of $\Gamma$ on V-nodes (actors) is a unipartite graph:
the nodes are the V-nodes of $\Gamma$, and there is an edge between two V-nodes if they belong to at least one
common community (if these actors played together in at least one movie). One can construct a random
graph by considering the one-mode projection of a random bipartite graph. To our knowledge, neither
rigorous proofs nor heuristics have been given for the contagion on such a model. There are several random
bipartite graph models, and even the literature on the classical SIR epidemic on the one-mode projection
of such models is incomplete. When the random bipartite graph has an arbitrary degree distribution for
each of the two types of nodes, heuristics have been derived by Newman [19] for the classical SIR epidemic on
its one-mode projection. Rigorous results (for the classical SIR epidemic) were obtained by Britton
et al. [B] and by Bollobas et al. [5], but the random graphs considered are such that their asymptotic 
degree distributions are respectively Poisson and mixed Poisson. Recently, Hackett et al. [13] studied by
heuristic means the contagion model on random graphs with overlapping communities, i.e. with nodes
that can belong to several cliques. They also derive results about the clustering effect on the contagion
spread (for their random graph model). However, the communities in that random graph model are only
of size three (that model does not come from the one-mode projection of a random bipartite graph).
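
As a small illustration of the one-mode projection described in this paragraph, the sketch below builds the projected graph from a community membership table. The function name `one_mode_projection`, the dictionary format and the toy movie data are hypothetical choices made for the example only.

```python
from itertools import combinations


def one_mode_projection(membership):
    """Project a bipartite graph onto its member nodes.

    membership : dict mapping each community (E-node) to an iterable of its
                 members (V-nodes); this input format is an assumption.
    Returns a dict mapping each member to the set of its neighbours: two
    members are adjacent when they share at least one community.
    """
    adj = {}
    for members in membership.values():
        members = set(members)
        for v in members:
            adj.setdefault(v, set())
        # Every community becomes a clique in the projection.
        for u, v in combinations(members, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


# Toy collaboration graph: actor "c" belongs to two communities (movies).
movies = {"m1": ["a", "b", "c"], "m2": ["c", "d"]}
actors = one_mode_projection(movies)
```

Here actor "c" ends up in two overlapping cliques, which is exactly the feature that single-clique clustered models cannot represent.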

Our random graph model is inspired by the one-mode projection of a random bipartite graph with
arbitrary degree distributions. More precisely, we consider the one-mode projection of an alternating 
branching process that approximates locally this random bipartite graph (see [HI Section 7.2] for this 
approximation), and study rigorously the contagion on this random graph model. Our goal is twofold: 
(i) we study rigorously the contagion on random graphs with overlapping communities, which, to our
knowledge, has not been done yet; (ii) our study provides heuristics for the contagion on the one-mode
projection of random bipartite graphs with arbitrary degree distributions.

This paper is organized as follows. In Section 2, we define our random graph model and recall its degree
distribution and clustering coefficient. In Section 3, we state our theorem about the phase transition for
the contagion spread on our random graph model with overlapping communities. In Section 4, we
compute the entries of the matrix involved in the phase transition. Finally, in Section 5, we prove this
phase transition, applying Sevastyanov's theorem about the phase transition of multi-type Galton-Watson
branching processes.



2 Random graph model and its basic properties 

In this paper, we consider the one-mode projection of an alternating branching process. We define this
branching process in Section 2.1 and our random graph model in Section 2.2. The degree distribution
and clustering coefficient of our random graph model are the same as those computed in [22, IV], and
are recalled in Section 2.3.

Let $p = (p_d)_d$ and $q = (q_w)_w$ be two probability distributions with positive finite means
$\lambda := \sum_d d\, p_d \in (0, \infty)$ and $\mu := \sum_w w\, q_w \in (0, \infty)$.



2.1 Alternating branching process 



Our random graph model is constructed from the following alternating branching process $\Gamma(p, q)$, whose
definition is given in Section 2.1.1 and whose phase transition (for $\Gamma(p, q)$ to be finite/infinite) is
recalled in Section 2.1.2.



2.1.1 Definition 

The branching process $\Gamma(p, q)$ is an alternating one: each node is either of type V or E, and a generation
of V-nodes gives birth to a generation of E-nodes, and conversely.

Let $\tilde D$ and $\tilde W$ be independent random variables with the following distributions:

\[ \mathbb{P}(\tilde D = d - 1) = \frac{d\, p_d}{\lambda}, \qquad \mathbb{P}(\tilde W = w - 1) = \frac{w\, q_w}{\mu}. \]

The variable $\tilde D$ (resp. $\tilde W$) will represent the offspring number of a V-node (resp. E-node), except for the
root (whose offspring distribution is p).

We can define formally the branching process $\Gamma = \Gamma(p, q)$ as follows (in the following, only the
definition of the i-th V and E-generations is required). The root has type V and offspring distribution
p: its number $D_0$ of children satisfies $\mathbb{P}(D_0 = d) = p_d$. Let $(\tilde W_j^{(i)})_{i,j \ge 1}$ (resp. $(\tilde D_k^{(i)})_{i,k \ge 1}$) be random
variables distributed as $\tilde W$ (resp. $\tilde D$), all variables being independent and independent from $D_0$. Then
each child j ($1 \le j \le D_0$) of the root is an E-node that gives birth to $\tilde W_j^{(1)}$ V-nodes, so that the root has
$\xi_V^{(1)} = \sum_{j=1}^{D_0} \tilde W_j^{(1)}$ grandchildren. Each such node k, $1 \le k \le \xi_V^{(1)}$, is a V-node that gives birth to
$\tilde D_k^{(1)}$ E-nodes. Set $\xi_E^{(1)} = \sum_{k=1}^{\xi_V^{(1)}} \tilde D_k^{(1)}$. Notations are summarized in Figure 1.

The root corresponds to V-generation numbered 0, its children to E-generation 0, and so on (until
now, we constructed V and E-generations 0 and 1). Assume generations 0 to $i - 1$ are constructed ($i \ge 2$),
with $\xi_E^{(i-1)}$ for the number of nodes in the $(i-1)$-th E-generation. Then each node j, $1 \le j \le \xi_E^{(i-1)}$,
gives birth to $\tilde W_j^{(i)}$ V-nodes, and we set $\xi_V^{(i)} = \sum_{j=1}^{\xi_E^{(i-1)}} \tilde W_j^{(i)}$ for the total number of nodes in the i-th
V-generation. Each such node k, $1 \le k \le \xi_V^{(i)}$, gives birth to $\tilde D_k^{(i)}$ E-nodes, and we set
$\xi_E^{(i)} = \sum_{k=1}^{\xi_V^{(i)}} \tilde D_k^{(i)}$ for the total number of nodes in the i-th E-generation.

Figure 1: Notations for the alternating branching process $\Gamma(p, q)$.
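
To make the alternating construction concrete, here is a minimal sampling sketch of the successive V- and E-generation sizes. It assumes that p and q are given as dictionaries with finite support; the helper names `size_biased`, `sample` and `generation_sizes` are hypothetical and introduced only for this illustration.

```python
import random


def size_biased(dist):
    """Law of the offspring number of a non-root node.

    dist : dict k -> probability.  Returns the dict (k - 1) -> k * prob / mean,
    i.e. the size-biased law shifted by one.
    """
    mean = sum(k * pk for k, pk in dist.items())
    return {k - 1: k * pk / mean for k, pk in dist.items() if k * pk > 0}


def sample(dist, rng):
    """Draw one value from a finite distribution given as a dict."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values])[0]


def generation_sizes(p, q, n_gen, rng):
    """Return [(V-size, E-size)] for generations 0, 1, ..., n_gen."""
    d_tilde, w_tilde = size_biased(p), size_biased(q)
    v_size = 1                  # V-generation 0 is the root
    e_size = sample(p, rng)     # the root has offspring law p
    sizes = [(v_size, e_size)]
    for _ in range(n_gen):
        # Each E-node gives birth to V-nodes, then each V-node to E-nodes.
        v_size = sum(sample(w_tilde, rng) for _ in range(e_size))
        e_size = sum(sample(d_tilde, rng) for _ in range(v_size))
        sizes.append((v_size, e_size))
    return sizes


# Deterministic toy case: every V-node is in 3 communities of size 3.
sizes = generation_sizes({3: 1.0}, {3: 1.0}, 2, random.Random(0))
```

In the deterministic toy case each non-root node has exactly two offspring, so the generation sizes double at every half-step after the root.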

2.1.2 Extinction vs. survival 

Let D (resp. W) be a random variable with distribution p (resp. q). We define the following generating
functions, for $x \in [0, 1]$:

\[ F(x) = \sum_d p_d\, x^d, \qquad G(x) = \sum_w \frac{w\, q_w}{\mu}\, x^{w-1}, \qquad H(x) = \sum_d \frac{d\, p_d}{\lambda}\, x^{d-1}. \]

The phase transition for $\Gamma(p, q)$ to be finite/infinite is given by the next proposition, which is a direct
consequence of Theorem 1 in [1, Section I.A.5] (heuristics can also be found in [22, IV.A]):

Proposition 1. Let $p_{\mathrm{ext}}$ be the probability that the branching process $\Gamma(p, q)$ is finite.

• If $\mathbb{E}[W(W-1)]\, \mathbb{E}[D(D-1)] \le \mathbb{E}[W]\, \mathbb{E}[D]$, then $p_{\mathrm{ext}} = 1$;

• If $\mathbb{E}[W(W-1)]\, \mathbb{E}[D(D-1)] > \mathbb{E}[W]\, \mathbb{E}[D]$, then $p_{\mathrm{ext}} < 1$. More precisely, we have:

\[ p_{\mathrm{ext}} = (F \circ G)(\eta), \]
where $\eta := \inf \{x \in [0, 1] : (H \circ G)(x) = x\}$.
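
The extinction probability of Proposition 1 can be approximated numerically. The sketch below assumes that F is the generating function of p and that G and H are the generating functions of the size-biased laws of q and p (shifted by one), as used for the offspring of E-nodes and V-nodes; the function names are hypothetical. It iterates H∘G from 0, which converges monotonically to the smallest fixed point in [0, 1].

```python
def make_generating_functions(p, q):
    """Build F, G, H from finite-support distributions given as dicts."""
    lam = sum(d * pd for d, pd in p.items())
    mu = sum(w * qw for w, qw in q.items())

    def F(x):
        return sum(pd * x ** d for d, pd in p.items())

    def G(x):
        return sum(w * qw / mu * x ** (w - 1) for w, qw in q.items() if w > 0)

    def H(x):
        return sum(d * pd / lam * x ** (d - 1) for d, pd in p.items() if d > 0)

    return F, G, H


def extinction_probability(p, q, tol=1e-12, max_iter=100000):
    """Approximate (F o G)(eta), with eta the smallest fixed point of H o G."""
    F, G, H = make_generating_functions(p, q)
    eta = 0.0
    for _ in range(max_iter):
        nxt = H(G(eta))        # increasing sequence started at 0
        if abs(nxt - eta) < tol:
            eta = nxt
            break
        eta = nxt
    return F(G(eta))


# Communities of size 2 only, so G(x) = x; then H(x) = 0.1 + 0.9 x^2 and the
# smallest fixed point of H is 1/9.
p_ext = extinction_probability({1: 0.25, 3: 0.75}, {2: 1.0})
```

The stopping rule is a plain successive-difference test; near criticality the iteration converges slowly, so `max_iter` may need to be raised.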

2.2 Random graph model 

We construct the (rooted and unipartite) random graph $G = G_\Gamma(p, q)$ as follows. The root of G is the
root of $\Gamma = \Gamma(p, q)$, and the parent and the children of each E-node e in $\Gamma$ are connected into a clique
before e is removed, as illustrated in Figure 2. In other words, the random graph $G_\Gamma(p, q)$ is the one-mode
projection of $\Gamma$.

Even if the graph G is not a tree, its particular construction allows using the same terminology as
for a tree (see Figure 2). More precisely, let v be a node in G different from the root, and let d be its
distance from the root. By construction, there is exactly one node u, among the neighbors of v, which is
at distance d - 1 from the root: it is called the parent of v. In addition, the neighbors of v that are at
distance d from the root are called the brothers of v, and those at distance d + 1 the children of v. If we
consider a given clique of G, the parent of the clique is the node whose distance from the root is minimal.

Figure 2: Random graph model $G_\Gamma(p, q)$, constructed from the branching process $\Gamma(p, q)$ (labels in the
figure: parent of v, children of v).

Note that the random graph G is infinite if and only if $\Gamma$ is, so that the phase transition for G to be
finite/infinite is given by Proposition 1 above.

Remark. The branching process $\Gamma$ is a local approximation for the random bipartite graph B =
B(p, q) with arbitrary degree distributions p and q [HI Section 7.2]: informally, the root of the branching 
process $\Gamma$ represents a 'typical' vertex in B. Hence our random graph model G is a local approximation
for the one-mode projection $G_B(p, q)$ of B, and the root of G represents a 'typical' vertex in $G_B(p, q)$.



2.3 Degree distribution and clustering coefficient 

Proofs of the next proposition, which use generating functions, can be found in the paper of Newman [19].

Proposition 2 ([19]). We consider the random graph $G_\Gamma(p, q)$. Then the distribution of the degree $D'$ of the
root is given by:

\[ \mathbb{P}(D' = k) = \sum_{d \ge 1} p_d \sum_{w_1 + \dots + w_d = k + d} \; \prod_{i=1}^{d} \frac{w_i\, q_{w_i}}{\mu}. \]

Let T be the number of triangles the root belongs to, and let $P_0$ be the number of connected triples
the root belongs to. Then the local clustering coefficient C of the root is given by:

\[ C = \frac{\mathbb{E}(T)}{\mathbb{E}(P_0)} = \frac{\dfrac{\lambda}{\mu}\, \mathbb{E}[W(W-1)(W-2)]}{\mathbb{E}[D(D-1)] \left( \dfrac{\mathbb{E}[W(W-1)]}{\mathbb{E}[W]} \right)^{2} + \dfrac{\lambda}{\mu}\, \mathbb{E}[W(W-1)(W-2)]}, \]

where D (resp. W) is a random variable with distribution p (resp. q).
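
The two quantities of this section can be evaluated numerically for finite-support distributions. The sketch below is an illustration under stated assumptions: the degree of the root is computed as a sum of D independent clique contributions (size-biased community size minus one), and the clustering coefficient is computed as the ratio of the expected number of triangles through the root to the expected number of pairs of neighbors of the root. The function names are hypothetical.

```python
def root_degree_distribution(p, q):
    """Law of the root degree: a sum of D i.i.d. (size-biased W) - 1 terms."""
    mu = sum(w * qw for w, qw in q.items())
    step = {w - 1: w * qw / mu for w, qw in q.items() if w * qw > 0}
    law = {}
    conv = {0: 1.0}  # law of a sum of zero terms
    for d in range(max(p) + 1):
        weight = p.get(d, 0.0)
        if weight > 0:
            for k, prob in conv.items():
                law[k] = law.get(k, 0.0) + weight * prob
        # Convolve once more to obtain the law of a sum of d + 1 terms.
        nxt = {}
        for k, prob in conv.items():
            for s, ps in step.items():
                nxt[k + s] = nxt.get(k + s, 0.0) + prob * ps
        conv = nxt
    return law


def clustering_coefficient(p, q):
    """Expected triangles through the root over expected neighbour pairs."""
    lam = sum(d * pd for d, pd in p.items())
    mu = sum(w * qw for w, qw in q.items())
    d2 = sum(d * (d - 1) * pd for d, pd in p.items())
    w2 = sum(w * (w - 1) * qw for w, qw in q.items())
    w3 = sum(w * (w - 1) * (w - 2) * qw for w, qw in q.items())
    same_clique = lam * w3 / mu          # pairs of neighbours sharing a clique
    cross_clique = d2 * (w2 / mu) ** 2   # pairs of neighbours in two cliques
    return same_clique / (cross_clique + same_clique)


# Root in exactly two triangles: degree 4, and 2 of its 6 neighbour pairs
# are connected.
c_two_triangles = clustering_coefficient({2: 1.0}, {3: 1.0})
```

The common factor 1/2 in the numerator and denominator has been cancelled, which is why it does not appear in the code.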



3 Phase transition for the contagion: statement of our result 

In Section 3.1, we define the contagion model described in the introduction, on the random graph
$G = G_\Gamma(p, q)$. In Section 3.2, we state our main theorem (whose proof is given in Section 5).
3.1 Epidemic model



Let $q \in (0, 1)$. We consider the contagion model described in the introduction, on the random graph
$G = G_\Gamma(p, q)$, with parameter q and the seed consisting of the root only (the threshold q should not be
confused with the distribution $q = (q_w)_w$). For simplicity, players B are
called active vertices, and players A inactive vertices. The progressive dynamics of the contagion on
the graph G operates as follows: the root starts out being active; all other vertices are inactive. Time
operates in discrete steps t = 1, 2, 3, .... At a given time t, any inactive vertex becomes active if its
proportion of active neighbors is strictly greater than q.

Let $G^{(q)}_{\mathrm{act}}$ be the graph induced by the active vertices of the graph G. We say that a cascade occurs if
the graph $G^{(q)}_{\mathrm{act}}$ of active vertices is infinite. We will show a phase transition for the cascade phenomenon,
i.e. for the graph $G^{(q)}_{\mathrm{act}}$ to be finite/infinite.

Remark. We consider a clique of size w whose parent u becomes active at time t. The other nodes
inside the clique can be activated until time t + w - 1. More precisely, the first nodes (different from u)
that are possibly activated inside the clique are those with the fewest children. If such activations occur at
time t + 1, then they can turn other nodes of the clique into active ones at time t + 2, and so on.



3.2 Phase transition 

In the following, we make the additional assumption that the degrees of nodes in $\Gamma(p, q)$ are bounded:
there exist $\bar d, \bar w \ge 1$ such that $p_d = 0$ for $d > \bar d$ and $q_w = 0$ for $w > \bar w$. In addition, we assume that
$p_0 = q_0 = q_1 = 0$.



We consider the random graph $G = G_\Gamma(p, q)$ defined in Section 2.2. We define the type of a vertex
in G as its number of children in G. For all $x_0, x \in \{0, 1, \dots, \bar d \bar w\}$, we set $m_{x_0 x}$ for the mean number of
active children of type x of an active vertex of type $x_0$ in G. In other words, we consider an active vertex
u, different from the root, with $x_0$ children (if the probability that such a vertex exists is zero, then we set
$m_{x_0 x} = 0$). Once the contagion has spread among all its children, we count the number of such children
that are active and have exactly x children. The mean of this quantity (among all possible realizations
of G for the children and grandchildren of u) is called $m_{x_0 x}$. The matrix $M = (m_{x_0 x})_{0 \le x_0, x \le \bar d \bar w}$ will be
computed in Section 4.

We have the following phase transition for the graph $G^{(q)}_{\mathrm{act}}$ to be finite/infinite, which will be proved in
Section 5.3.

Theorem 3. Let $p_f$ be the probability that the random graph $G^{(q)}_{\mathrm{act}}$ is finite, and let $\rho$ be the largest
eigenvalue of M. Then we have:

(i) if $q \ge 1/2$, then $p_f = 1$;

(ii) if $q < 1/2$, then

• if $p_2 = q_2 = 1$, then the random graph $G^{(q)}_{\mathrm{act}}$ is infinite with probability one;

• otherwise $p_f = 1$ if and only if $\rho \le 1$.



4 Computation of the matrix M 



Before computing the entries of the matrix M defined in Section 3.2, we state two lemmas about the
number of active vertices inside a given clique. These lemmas will also be used in Section 5.1 (to define
the multi-type branching process that describes the contagion spread in $G_\Gamma(p, q)$).



4.1 Lemmas about the number of active vertices inside a given clique 



The following lemma computes the number of active vertices inside a clique, when the contagion has 
spread inside the clique. 

Lemma 4. We consider the contagion spread inside a clique of size w, whose parent u is initially active.
Let L be the final number of active children (active vertices different from u) inside the clique. We denote
by $\{1, \dots, w - 1\}$ the set of children inside the clique, and set $X_i$ for the number of children of vertex i,
$1 \le i \le w - 1$. Then L satisfies the following equation:

\[ L = \min \left\{ i \in \{1, \dots, w - 1\} \;:\; \lfloor q (X_{(i)} + w - 1) \rfloor + 1 > i \right\} - 1, \tag{1} \]

where $X_{(1)} = \min_i X_i \le X_{(2)} \le \dots \le X_{(w-1)} = \max_i X_i$ is the order statistics of $(X_i)_{1 \le i \le w-1}$, and
$L = w - 1$ if this set is empty (i.e. if $\lfloor q (X_{(i)} + w - 1) \rfloor + 1 \le i$ for all $1 \le i \le w - 1$).

Proof. By definition, a node i, $1 \le i \le w - 1$, becomes active if and only if the proportion of its active
neighbors is strictly greater than q. Since node i has $X_i + w - 1$ neighbors (its $X_i$ children and the $w - 1$
other members of the clique), this holds if and only if its number of active neighbors is at least

\[ A_i := \lfloor q (X_i + w - 1) \rfloor + 1. \tag{2} \]

We use the order statistics of $(A_i)_{1 \le i \le w-1}$ (or equivalently those of $(X_i)_{1 \le i \le w-1}$): nodes with fewer
children need fewer active neighbors to become active, and the first node(s) to become possibly active is
(are) the one(s) with $X_{(1)}$ children. More precisely, if $A_{(1)} > 1$, no node different from u inside the clique
can be activated, and L = 0. If $A_{(1)} \le 1$, then at least one node (different from u) is activated. Then a
second one is also activated if and only if $A_{(2)} \le 2$, and the lemma follows by a simple induction.

□ 
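
The characterization of L by order statistics can be checked against a direct simulation of the dynamics inside one clique. The sketch below is an illustration with hypothetical function names; it assumes, as in the proof, that a child with x children in a clique of size w has x + w - 1 neighbors and that its own children are inactive while the clique is being explored.

```python
from fractions import Fraction
from math import floor


def final_active_children(children_counts, q):
    """Number L of active children, from the order-statistics rule of Lemma 4."""
    w = len(children_counts) + 1          # clique size, parent included
    for i, x in enumerate(sorted(children_counts), start=1):
        # The i-th smallest node needs floor(q (x + w - 1)) + 1 active neighbours
        # and sees at most i of them (the parent and i - 1 earlier nodes).
        if floor(q * (x + w - 1)) + 1 > i:
            return i - 1
    return w - 1


def simulate_clique(children_counts, q):
    """Same quantity, by running the synchronous threshold dynamics."""
    w = len(children_counts) + 1
    active = set()
    while True:
        newly = {
            i for i, x in enumerate(children_counts)
            if i not in active
            and Fraction(1 + len(active), x + w - 1) > q   # parent + active members
        }
        if not newly:
            return len(active)
        active |= newly


# Clique of size 4 whose three children have 0, 3 and 1 children.
l_formula = final_active_children([0, 3, 1], Fraction(1, 4))
l_simulated = simulate_clique([0, 3, 1], Fraction(1, 4))
```

Passing q as a `Fraction` keeps the floor and the strict inequality exact, so both functions agree on boundary cases.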

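The next lemma provides the joint distribution of L and the order statistics $(X_{(i)})_{1 \le i \le L}$. The random
variables $(X_i)_{1 \le i \le w-1}$ are i.i.d. and distributed as the following random variable X, which represents the
number of children of a vertex different from the root in $G_\Gamma(p, q)$ (this equation is similar to the one of
Proposition 2):

\[ \mathbb{P}(X = k) = \sum_{d \ge 1} \frac{d\, p_d}{\lambda} \sum_{(w_1 - 1) + \dots + (w_{d-1} - 1) = k} \; \prod_{i=1}^{d-1} \frac{w_i\, q_{w_i}}{\mu}. \tag{3} \]

For any sequence $x_1 \le x_2 \le \dots \le x_\ell$ and any $i \in \{1, \dots, \ell - 1\}$, we set (omitting the dependency on
$(x_1, \dots, x_\ell)$):

\[ s_i := \begin{cases} \max \{ j \ge 0 \mid x_i = x_{i+j} \} + 1 & \text{if } x_{i-1} < x_i \text{ or } i = 1, \\ 1 & \text{if } x_{i-1} = x_i. \end{cases} \tag{4} \]

When $x_{i-1} < x_i$ (or $i = 1$), $s_i$ is the number of $i' \ge i$ such that $x_{i'} = x_i$. For instance, if $\ell = 6$ and
$x_1 < x_2 = x_3 = x_4 < x_5 = x_6$, then $s_1 = 1$, $s_2 = 3$, $s_3 = s_4 = 1$ and $s_5 = 2$, so that $\prod_i s_i! = 3!\, 2!$.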

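Lemma 5. We consider a clique of size w whose parent u is initially active, and use the same notations
as for Lemma 4. Then we have, for any $1 \le \ell \le w - 1$ and $0 \le x_1 \le x_2 \le \dots \le x_\ell$:

\[ p^{(E)}(\ell, x_1, \dots, x_\ell \mid w) := \mathbb{P}(L = \ell, X_{(1)} = x_1, \dots, X_{(\ell)} = x_\ell) \]
\[ = \frac{(w-1)!}{\left( \prod_{i=1}^{\ell-1} s_i! \right) (w - 1 - \ell)!} \; \prod_{i=1}^{\ell} \Big( \mathbb{P}(X = x_i)\, \mathbf{1}\{ \lfloor q (x_i + w - 1) \rfloor + 1 \le i \} \Big) \; \Big( \mathbb{P}( \lfloor q (X + w - 1) \rfloor > \ell ) \Big)^{w - 1 - \ell}, \]

and

\[ p^{(E)}(0 \mid w) := \mathbb{P}(L = 0) = \Big( \mathbb{P}( \lfloor q (X + w - 1) \rfloor > 0 ) \Big)^{w-1}. \]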

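Proof. The joint distribution of the order statistics for a sequence of i.i.d. discrete random variables
$(Y_i)_{1 \le i \le n}$, distributed as Y, is given by the following equation, for any $y_1 \le y_2 \le \dots \le y_n$ (see [TTJl
Equation (2.3)]):

\[ \mathbb{P}(Y_{(1)} = y_1, \dots, Y_{(n)} = y_n) = \frac{n!}{\prod_i s_i!} \prod_{i=1}^{n} \mathbb{P}(Y = y_i) =: p_Y(y_1, \dots, y_n), \tag{5} \]

where $(s_i)_i$ (defined in (4)) correspond to the sequence $(y_1, \dots, y_n)$.

If $\lfloor q (x_i + w - 1) \rfloor + 1 > i$ for some $1 \le i \le \ell$, then $L < \ell$ (due to (1)), so that $p^{(E)}(\ell, x_1, \dots, x_\ell \mid w) = 0$.

We assume that $\lfloor q (x_i + w - 1) \rfloor + 1 \le i$ for all $1 \le i \le \ell$. Then we have, using equation (5) for the
random variable X defined in (3):

\[ p^{(E)}(\ell, x_1, \dots, x_\ell \mid w) = \sum_{x_\ell \le x_{\ell+1} \le \dots \le x_{w-1}} p_X(x_1, \dots, x_{w-1})\, \mathbf{1}\{ \lfloor q (x_{\ell+1} + w - 1) \rfloor > \ell \} \]
\[ = \sum_{x_{\ell+1} \le \dots \le x_{w-1}} p_X(x_1, \dots, x_{w-1})\, \mathbf{1}\{ \lfloor q (x_{\ell+1} + w - 1) \rfloor > \ell \}. \tag{6} \]

The last equality comes from the fact that $\lfloor q (x_\ell + w - 1) \rfloor + 1 \le \ell$, so that the indicator already forces
$x_{\ell+1} > x_\ell$. We set:

\[ g_X(\ell, w) := \sum_{x_{\ell+1} \le \dots \le x_{w-1}} p_X(x_{\ell+1}, \dots, x_{w-1})\, \mathbf{1}\{ \lfloor q (x_{\ell+1} + w - 1) \rfloor > \ell \}. \]

Then we have, using (6) and replacing $p_X(x_1, \dots, x_{w-1})$ by its expression in (5):

\[ p^{(E)}(\ell, x_1, \dots, x_\ell \mid w) = \left( \frac{(w-1)!}{\prod_{i=1}^{\ell-1} s_i!} \prod_{i=1}^{\ell} \mathbb{P}(X = x_i) \right) \frac{g_X(\ell, w)}{(w - 1 - \ell)!}. \]

We now compute $g_X(\ell, w)$. Let $Z_1, \dots, Z_{w-1-\ell}$ be i.i.d. random variables distributed as X, and set
$Z_{(1)} := \min_{1 \le k \le w-1-\ell} Z_k$. Then we have, using (5):

\[ g_X(\ell, w) = \sum_{x_{\ell+1} \le \dots \le x_{w-1}} p_X(x_{\ell+1}, \dots, x_{w-1})\, \mathbf{1}\{ \lfloor q (x_{\ell+1} + w - 1) \rfloor > \ell \} \]
\[ = \mathbb{P}( \lfloor q (Z_{(1)} + w - 1) \rfloor > \ell ) = \Big( \mathbb{P}( \lfloor q (X + w - 1) \rfloor > \ell ) \Big)^{w - 1 - \ell}. \]

This proves the first equation, and the second one follows from

\[ \mathbb{P}(L = 0) = \mathbb{P}( \lfloor q (X_i + w - 1) \rfloor + 1 > 1 \text{ for all } i \le w - 1 ) = \Big( \mathbb{P}( \lfloor q (X + w - 1) \rfloor > 0 ) \Big)^{w-1}, \]

which ends the proof. □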
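
As a sanity check on the joint law of Lemma 5, the sketch below evaluates it exactly with rational arithmetic and verifies that it sums to one. It relies on two assumptions that should be read as such: the law of X is computed as a sum of (size-biased D minus one) i.i.d. copies of (size-biased W minus one), and the joint law is taken to be a multinomial coefficient times the product of the point masses, the activation indicators, and the probability that each remaining child needs more than ℓ + 1 active neighbors. All function names and the toy distributions are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement
from math import factorial, floor


def children_distribution(p_dist, q_dist):
    """Exact law of X, the number of children of a non-root vertex."""
    lam = sum(d * pd for d, pd in p_dist.items())
    mu = sum(w * qw for w, qw in q_dist.items())
    step = {w - 1: w * qw / mu for w, qw in q_dist.items() if w * qw > 0}
    law = {}
    conv = {0: Fraction(1)}               # law of a sum of d - 1 = 0 terms
    for d in range(1, max(p_dist) + 1):
        weight = d * p_dist.get(d, 0) / lam
        if weight:
            for k, prob in conv.items():
                law[k] = law.get(k, 0) + weight * prob
        nxt = {}
        for k, prob in conv.items():
            for s, ps in step.items():
                nxt[k + s] = nxt.get(k + s, 0) + prob * ps
        conv = nxt
    return law


def p_E(ell, xs, w, law, threshold):
    """P(L = ell, X_(1) = xs[0], ..., X_(ell) = xs[ell - 1]) in a clique of size w."""
    if any(floor(threshold * (x + w - 1)) + 1 > i for i, x in enumerate(xs, start=1)):
        return Fraction(0)
    # Probability that a child needs more than ell + 1 active neighbours.
    tail = sum(pr for x, pr in law.items() if floor(threshold * (x + w - 1)) > ell)
    ties = 1
    for multiplicity in Counter(xs).values():
        ties *= factorial(multiplicity)
    prob = Fraction(factorial(w - 1), ties * factorial(w - 1 - ell))
    for x in xs:
        prob *= law.get(x, 0)
    return prob * tail ** (w - 1 - ell)


def total_mass(w, law, threshold):
    """Sum of p_E over all ell and all sorted tuples; should equal 1."""
    support = sorted(law)
    return sum(
        p_E(ell, xs, w, law, threshold)
        for ell in range(w)
        for xs in combinations_with_replacement(support, ell)
    )


p_toy = {1: Fraction(1, 2), 2: Fraction(1, 2)}
q_toy = {2: Fraction(1, 2), 3: Fraction(1, 2)}
law_x = children_distribution(p_toy, q_toy)
mass = total_mass(3, law_x, Fraction(1, 4))
```

The enumeration is exponential in the clique size, so this check is only meant for very small supports.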
4.2 The number of active vertices of a given type

The next proposition gives the computation of the matrix M (whose largest eigenvalue is used in
Theorem 3).

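Proposition 6. We define the following quantities, for $0 \le x \le \bar d \bar w$, $1 \le w \le \bar w$, $1 \le k \le w - 1$,
$k + \lfloor q (x + w - 1) \rfloor \le \ell \le w - 1$ and $\lfloor q (x + w - 1) \rfloor \le i \le \ell - k$ (with $p^{(E)}(\ell, x_1, \dots, x_\ell \mid w)$ defined in
Lemma 5):

\[ S(x, k, \ell, i) := \{ (x_1, \dots, x_\ell) \mid x_1 \le \dots \le x_i < x_{i+1} = \dots = x_{i+k} = x < x_{i+k+1} \le \dots \le x_\ell \}, \]

\[ P_{x|w}(k, \ell, i) := \sum_{(x_1, \dots, x_\ell) \in S(x, k, \ell, i)} p^{(E)}(\ell, x_1, \dots, x_\ell \mid w), \]

\[ m_{x|w} := \sum_{1 \le k \le w-1} k \cdot \left( \sum_{k + \lfloor q (x + w - 1) \rfloor \le \ell \le w-1} \left( \sum_{\lfloor q (x + w - 1) \rfloor \le i \le \ell - k} P_{x|w}(k, \ell, i) \right) \right). \]

Let v be an active vertex in $G = G_\Gamma(p, q)$ with type $x_0$, $0 \le x_0 \le \bar d \bar w$. Then the mean number $m_{x_0, x}$
of its active children having type x, $0 \le x \le \bar d \bar w$, is $m_{0, x} = 0$ if $x_0 = 0$ and otherwise is given by:

\[ m_{x_0, x} = \frac{1}{\mathbb{P}(X = x_0)} \sum_{d \ge 1} \frac{d\, p_d}{\lambda} \sum_{(w_1 - 1) + \dots + (w_{d-1} - 1) = x_0} \left( \prod_{i=1}^{d-1} \frac{w_i\, q_{w_i}}{\mu} \right) \sum_{i=1}^{d-1} m_{x|w_i}. \tag{7} \]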



Proof. Using the definition of $p^{(E)}(\ell, x_1, \dots, x_\ell \mid w)$ in Lemma 5, one can easily see that $P_{x|w}(k, \ell, i)$ is
the probability that a clique of size w, whose parent is initially active, has exactly $\ell$ active children, i of
which have type strictly less than x and exactly k of which have type x.

Hence the mean number, for a clique of size w, of its active children having type x is given by:

\[ N_{x|w} := \sum_{1 \le k \le w-1} k \cdot \left( \sum_{0 \le \ell \le w-1} \left( \sum_{0 \le i \le \ell - k} P_{x|w}(k, \ell, i) \right) \right). \]

In addition, for the children of type x to be activated inside the clique, they need at least
$\lfloor q (x + w - 1) \rfloor + 1$ active neighbors (due to (2)), namely the parent of the clique and the active children
of type strictly less than x. In other words, for $P_{x|w}(k, \ell, i)$ to be
positive, one should have $i + 1 \ge \lfloor q (x + w - 1) \rfloor + 1$. Moreover, we have: $\ell \ge k + i \ge k + \lfloor q (x + w - 1) \rfloor$,
so that $N_{x|w} = m_{x|w}$.

The end of the proof follows easily, since we can study independently two different cliques having the
same parent: with the notations of Figure 2, if the parent of v is active, the fact that v becomes active
or not is independent of the activation/non-activation of v', when v' is not a brother of v.

□



5 Proof of Theorem 3

This section is organized as follows. First we define a multi-type (and alternating) branching process
that completely describes the contagion spread in $G = G_\Gamma(p, q)$ (using Lemma 5 of Section 4.1). Second
we recall Sevastyanov's theorem about the phase transition for multi-type Galton-Watson branching
processes [23]. We then use this theorem to prove Theorem 3 (stated in Section 3.2).
5.1 Description of the contagion model by a multi-type (and alternating) branching process

We now define an alternating branching process $\Gamma'$ in which the number of V-nodes in the i-th
V-generation is distributed as the number of vertices in the i-th generation of $G^{(q)}_{\mathrm{act}}$ (for any $i \ge 0$). As
explained in Lemma 4, the number L of active children at the end of the contagion spread inside a clique
depends on:

• the clique size w,

• the number of children in $G_\Gamma(p, q)$ of each child i, $1 \le i \le w - 1$, inside the clique.

These dependencies lead to considering a multi-type branching process for $\Gamma'$ (for an introduction to such
processes, see for instance Harris [14, Chapter 2] or Mode [17, Chapters 1 and 2]).



We need the following notation. For all $1 \le d \le \bar d$ and $0 \le w_1, w_2, \dots, w_{d-1} \le \bar w$, we set:

\[ p^{(V)}(d - 1, w_1, \dots, w_{d-1} \mid x) := \mathbf{1}\Big\{ \sum_{i=1}^{d-1} (w_i - 1) = x \Big\} \; \frac{d\, p_d}{\lambda} \; \frac{x!}{(H \circ G)^{(x)}(0)} \; \prod_{i=1}^{d-1} \frac{w_i\, q_{w_i}}{\mu}, \tag{8} \]

where the generating functions G and H are defined in Section 2.1.2 and $(H \circ G)^{(x)}(0)$ is the value at
point 0 of the x-th derivative of $H \circ G$.

Proposition 7. We define an alternating and multi-type branching process $\Gamma'$ as follows. The root is a
V-node that gives birth to d ($\le \bar d$) E-nodes with probability $p_d$. Each of these E-nodes has type w with
probability $w q_w / \mu$.

An E-node of type w gives birth to $\ell \in \{0, \dots, w - 1\}$ V-nodes of types $0 \le x_1 \le x_2 \le \dots \le x_\ell \le \bar d \bar w$
with probability $p^{(E)}(\ell, x_1, \dots, x_\ell \mid w)$ defined in Lemma 5.

Finally, a V-node of type $x \ge 0$ gives birth to $d - 1$ E-nodes of types $w_1, \dots, w_{d-1}$ with probability
$p^{(V)}(d - 1, w_1, \dots, w_{d-1} \mid x)$ defined above.

As before, we set $\Gamma = \Gamma(p, q)$ for the alternating branching process defined in Section 2.1, $G = G_\Gamma(p, q)$
for the corresponding random graph model, and $G^{(q)}_{\mathrm{act}}$ for the random graph of active vertices in G (as
defined in Section 3.1). Then there is a coupling between $\Gamma'$ and $G^{(q)}_{\mathrm{act}}$ such that:

• the number of V-nodes in the i-th V-generation of $\Gamma'$ is distributed as the number of vertices in the
i-th generation of $G^{(q)}_{\mathrm{act}}$ (for any $i \ge 0$);

• the type w attached to each E-node in $\Gamma'$ corresponds to its 'size' in $\Gamma$ (i.e. its offspring number in
$\Gamma$ is $w - 1$);

• the type x attached to each V-node in $\Gamma'$ corresponds to its number of children in G (i.e. its number
of grandchildren in $\Gamma$ is x).



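Proof. Using Lemma 5 and the fact that the contagion spreads independently in two different cliques, we
are left to prove that, when $\sum_{i=1}^{d-1} (w_i - 1) = x$, the probability

\[ P := \mathbb{P}\Big( \tilde D = d - 1, \tilde W_1 = w_1 - 1, \dots, \tilde W_{\tilde D} = w_{d-1} - 1 \;\Big|\; \sum_{i=1}^{\tilde D} \tilde W_i = x \Big) \]

is given by (8), where $\tilde D$ is the offspring number of a V-node in $\Gamma(p, q)$ (with generating function H),
and $(\tilde W_i)_i$ the offspring numbers of E-nodes in $\Gamma(p, q)$ (with generating function G).

We have:

\[ P = \frac{\mathbb{P}(\tilde D = d - 1, \tilde W_1 = w_1 - 1, \dots, \tilde W_{d-1} = w_{d-1} - 1)}{\mathbb{P}\big( \sum_{i=1}^{\tilde D} \tilde W_i = x \big)}. \]

The numerator is easy to compute:

\[ \mathbb{P}(\tilde D = d - 1, \tilde W_1 = w_1 - 1, \dots, \tilde W_{d-1} = w_{d-1} - 1) = \frac{d\, p_d}{\lambda} \prod_{i=1}^{d-1} \frac{w_i\, q_{w_i}}{\mu}. \]

We use generating functions to compute the denominator. We have, for all $y \in [0, 1]$:

\[ \mathbb{E}\Big[ y^{\sum_{i=1}^{\tilde D} \tilde W_i} \Big] = \sum_{d \ge 1} \mathbb{P}(\tilde D = d - 1)\, \mathbb{E}\big[ y^{\tilde W} \big]^{d-1} = (H \circ G)(y), \]

which leads to:

\[ \mathbb{P}\Big( \sum_{i=1}^{\tilde D} \tilde W_i = x \Big) = \frac{1}{x!} (H \circ G)^{(x)}(0), \]

and ends the proof.

□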



By definition, there is a cascade in $G_\Gamma(p, q)$ if $G^{(q)}_{\mathrm{act}}$ is infinite, which occurs if and only if the branching
process $\Gamma'$ is infinite.



5.2 Phase transition for multi-type Galton-Watson branching processes

We refer to Harris [14, Chapter 2] for further information on multi-type Galton-Watson branching
processes. Let Z be a multi-type Galton-Watson branching process, with k types, starting from only one
individual of a given type $i_0$ (results are the same when the process starts from a finite number of
individuals, but we are interested only in this case here). Let $M = (m_{ij})_{1 \le i, j \le k}$ be the first-moment matrix
of Z, i.e. $m_{ij}$ is the mean number of children of type j created by a single individual of type i.

We next recall Sevastyanov's theorem about the phase transition for multi-type Galton-Watson
branching processes, which we will use in the next section. We define final classes as in [14, Chapter 2].
Let $m^{(n)}_{ij}$ be, for $n \ge 1$, the element in the i-th row and j-th column of $M^n$. The types i and j
communicate if $m^{(n)}_{ij}, m^{(n')}_{ji} > 0$ for some $n, n' \ge 1$. A type that communicates neither with itself nor
with any other type is called singular; a class is a set of types, each pair of which communicate, that
is not contained in any other set having this property. The types fall uniquely into singular types and
mutually exclusive classes. A final class C is a class in which any individual of type $i \in C$ has probability
one to give birth to exactly one individual with type in C (other individuals whose type is not in C may
also be produced). Let $\rho$ be the largest eigenvalue of M. Sevastyanov's theorem can be stated as follows:
Theorem 8 ([23]). The probability of extinction of the branching process Z is one if and only if (a) $\rho \le 1$
and (b) there are no final classes.
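
The decomposition of the types into singular types and classes, which underlies condition (b), depends only on which entries of the powers of M are positive. The sketch below computes it from a non-negative first-moment matrix given as a list of lists; the function names are hypothetical. Deciding whether a class is final additionally requires the offspring distributions, not just M, so that step is deliberately left out.

```python
def reachability(M):
    """reach[i][j] is True iff (M^n)[i][j] > 0 for some n >= 1 (M non-negative)."""
    k = len(M)
    reach = [[M[i][j] > 0 for j in range(k)] for i in range(k)]
    # Warshall's algorithm: allow paths through intermediate type m.
    for m in range(k):
        for i in range(k):
            if reach[i][m]:
                for j in range(k):
                    if reach[m][j]:
                        reach[i][j] = True
    return reach


def classify_types(M):
    """Return (singular types, list of communicating classes)."""
    reach = reachability(M)
    k = len(M)
    singular, classes, seen = [], [], set()
    for i in range(k):
        if i in seen:
            continue
        if not reach[i][i]:
            # A type communicating with another type also communicates with
            # itself, so a type with no return path is singular.
            singular.append(i)
            continue
        cls = [j for j in range(k) if reach[i][j] and reach[j][i]]
        classes.append(cls)
        seen.update(cls)
    return singular, classes


# Toy matrix: type 0 never reappears, types 1 and 2 communicate.
M_toy = [
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.5, 0.5],
]
singular_types, type_classes = classify_types(M_toy)
```

A row of zeros, such as the one of a type with no children, always yields a singular type.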

The proof of this theorem originally appeared in a paper of Sevastyanov [23]. The statement used
here corresponds to [14, Chapter 2, Theorem 10.1].

Remark. There is a classical version of Theorem 8 (with a simpler proof) for positively regular
processes. The process Z is said to be positively regular if there exists $n \ge 1$ such that $M^n$ is positive,
i.e. all entries of $M^n$ are (strictly) positive. The particular case of Theorem 8 for positively regular
processes can be found in [14, Chapter 2, Theorem 7.1], or in any book dealing with multi-type branching
processes (see also Mode [17, Chapter 1] or Athreya and Ney [1, Chapter 5] for instance). Some examples
of non-positively regular processes are also considered in [17, Chapter 2]. The hypothesis that the
branching process has no final class is reduced to the hypothesis of non-singularity in the positively
regular case.

We will see in the proof of Theorem 3 that the process we consider is not positively regular as soon
as $p_1 > 0$ for instance, so that we need the stronger result of Sevastyanov.



5.3 Proof of Theorem |3] 

We construct a non- alternating branching process T' v from V by erasing .E-generations, so that a V- 
generation directly gives birth to the next V-generation, and we apply Theorem [8] to the branching 
process T' v . 

Note that mo x = for all x implies that the process T' v is not positively regular (if pi > 0), as 
mentioned in the remark following Theorem [8] 

Case (i) is obvious. We assume that $q < 1/2$. We will show that there are no final classes in $\Gamma'_V$ if 
and only if either $p_2 < 1$ or $q_2 < 1$. Then applying Theorem 8 ends the proof (the case $p_2 = q_2 = 1$ is 
direct). We first show that, if there exists a final class $C$ in $\Gamma'_V$, then necessarily $C = \{1\}$. We start with 
a definition and a lemma. 

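Definition. Let $0 \le x \le \bar d \bar w$. A configuration starting from $x$ is an element $\sigma_x$ of the form 

$$\sigma_x = \big( (w_i, x_{i1}, \dots, x_{i\ell_i}) \big)_{1 \le i \le d-1},$$ 

where $1 \le d \le \bar d$, $2 \le w_i \le \bar w$, $0 \le \ell_i \le w_i - 1$ and $0 \le x_{i1} \le \dots \le x_{i\ell_i} \le \bar d \bar w$ for all $1 \le i \le d-1$. We 
say that a configuration $\sigma_x$ has positive probability to occur if 

• $p_d > 0$, 

• $\sum_{i=1}^{d-1} (w_i - 1) = x$, 

• $q_{w_i} > 0$ and $p^{(E)}(\ell_i, x_{i1}, \dots, x_{i\ell_i} \mid w_i) > 0$ for all $1 \le i \le d-1$. 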

In other words, a configuration $\sigma_x$ of positive probability is a possible realization for the next two 
generations starting from a $V$-node $v$ of type $x$ in $\Gamma'$ (it contains all the information about the children 
and grandchildren of $v$, in the sense that the $w_i$ represent the types of the children of $v$ in $\Gamma'$, and the $x_{ij}$ the 
types of its grandchildren). The next lemma provides a simple way to construct new configurations $\sigma'_x$ of 
positive probability from a given configuration $\sigma_x$ of positive probability (the proof is obvious, 
using the expression of $p^{(E)}(\ell_i, x_{i1}, \dots, x_{i\ell_i} \mid w_i)$ given in Lemma [HJ]): 

Lemma 9. Let $\sigma_x = \big( (w_i, x_{i1}, \dots, x_{i\ell_i}) \big)_{1 \le i \le d-1}$ be a configuration that occurs with positive probability. 
Let $(y_{ij})_{i,j}$ be such that $y_{ij} \le x_{ij}$ and $P(X = y_{ij}) > 0$ for all $1 \le i \le d-1$, $1 \le j \le \ell_i$, where 
the distribution of $X$ is given by (3). Then the new configuration $\sigma'_x = \big( (w_i, y_{i1}, \dots, y_{i\ell_i}) \big)_{1 \le i \le d-1}$ still 
occurs with positive probability. 
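
The definition and the lemma above can be made concrete with a small Python sketch. It only encodes the combinatorial part: the form of a configuration, the constraint that the children's types add up to $x$, and the replacement of grandchild types used in Lemma 9. The names `has_valid_form`, `types_add_up`, `lower_types`, `d_max` and `w_max` are hypothetical (the last two stand for the upper bounds on $d$ and $w_i$), and the probabilistic conditions ($p_d > 0$, $q_{w_i} > 0$, $p^{(E)}(\cdot) > 0$, $P(X = y_{ij}) > 0$) are deliberately left out, since they depend on the distributions of the model.

```python
def has_valid_form(config, d_max, w_max):
    """Check the form of a configuration, as we read the definition.

    config is a list of pairs (w_i, xs_i), one per child of v: w_i is the
    type of the i-th child and xs_i lists the types of the grandchildren
    of v below that child.
    """
    d = len(config) + 1                      # the index i runs over 1..d-1
    if not 1 <= d <= d_max:
        return False
    for w, xs in config:
        if not 2 <= w <= w_max:
            return False
        if len(xs) > w - 1:                  # 0 <= l_i <= w_i - 1
            return False
        if list(xs) != sorted(xs):           # x_i1 <= ... <= x_il_i
            return False
        if any(not 0 <= y <= d_max * w_max for y in xs):
            return False
    return True


def types_add_up(x, config):
    """Check the constraint sum_i (w_i - 1) = x."""
    return sum(w - 1 for w, _ in config) == x


def lower_types(config, new_types):
    """Replace grandchild types x_ij by y_ij <= x_ij, as in Lemma 9.

    The condition P(X = y_ij) > 0 required by the lemma is NOT checked here.
    """
    result = []
    for (w, xs), ys in zip(config, new_types):
        if len(ys) != len(xs) or any(y > x for x, y in zip(xs, ys)):
            raise ValueError("each new type must satisfy y_ij <= x_ij")
        result.append((w, sorted(ys)))
    return result


# A toy configuration starting from x = 3: two children of types 2 and 3.
sigma = [(2, [1]), (3, [2, 4])]
sigma_prime = lower_types(sigma, [[1], [1, 2]])
```

Here `sigma_prime` is `[(2, [1]), (3, [1, 2])]`: the children's types are unchanged, so the constraint $\sum_i (w_i - 1) = x$ is preserved automatically, which is the combinatorial reason why only the positivity of $P(X = y_{ij})$ has to be assumed in the lemma.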

We assume that there is at least one final class $C$, and first prove that necessarily $C = \{1\}$. Let 
$x \in C$ be such that $P(X = x) > 0$, and let $v$ be a $V$-node in $\Gamma'$ with type $x$. We assume by contradiction 
that $x \ne 1$. Since $0$ is singular, necessarily $x > 1$. By the definition of a final class, the number 
$N$ of children of $v$ whose type is in $C$ is one almost surely. Hence there exists a configuration 
$\sigma_x = \big( (w_i, x_{i1}, \dots, x_{i\ell_i}) \big)_{1 \le i \le d-1}$ with a positive probability to occur and in which there exists a unique couple 
$(i_0, j_0)$ such that $x_{i_0 j_0} \in C$. We distinguish several cases: 

• If there exists $(k, \ell)$ such that $x_{k\ell} > x_{i_0 j_0}$, we consider the new configuration $\sigma'_x$ with $y_{ij} = x_{ij}$ 
if $(i, j) \ne (k, \ell)$ and $y_{k\ell} = x_{i_0 j_0}$. Then $N = 2$ in $\sigma'_x$, and $\sigma'_x$ has positive probability to occur by 
Lemma 9, which is a contradiction. 

• If there exists $(k, \ell)$ such that $x_{k\ell} < x_{i_0 j_0}$, we consider the new configuration $\sigma'_x$ with $y_{ij} = x_{ij}$ if 
$(i, j) \ne (i_0, j_0)$ and $y_{i_0 j_0} = x_{k\ell}$. Then $N = 0$ in $\sigma'_x$, and $\sigma'_x$ has positive probability to occur by 
Lemma 9, which is a contradiction. 

• Otherwise, $v$ has only one grandchild (of type $x_{i_0 j_0}$) in $\Gamma'$. We first consider the case where $w_{i_0} = 2$. 
Since $\sum_{i=1}^{d-1} (w_i - 1) = x > 1$, there exists $k \ne i_0$ such that $w_k \ge 2$. We construct $\sigma'_x$ by replacing 
$w_k$ by $w_k - 1$ $E$-nodes of type 2, each of which has a $V$-node of type $x_{i_0 j_0}$ as a child. Hence 
$N = w_k \ge 2$ in $\sigma'_x$, and it is easy to see that $\sigma'_x$ has positive probability to occur, which is a 
contradiction. 

• The remaining case is when $v$ has only one grandchild (of type $x_{i_0 j_0}$) in $\Gamma'$, and $w_{i_0} > 2$. We 
construct $\sigma'_x$ by replacing $(w_{i_0}, x_{i_0 j_0})$ by $(w_{i_0}, x_{i_0 j_0}, x_{i_0 j_0})$. Hence $N = 2$ in $\sigma'_x$, and it is easy to see 
that $\sigma'_x$ has positive probability to occur, which is a contradiction. 

Hence $C = \{1\}$, which means that $v$ has only one grandchild in the original branching process $\Gamma = \Gamma(p, q)$. 
It implies in particular that $q_2 > 0$, and we are left to compute the following quantity: 

$$p^{(E)}(\ell = 1, x = 1 \mid w = 2) = \mathbb{1}_{\{1/2 > q\}} \, P(X = 1) = \frac{2 p_2}{\lambda} \cdot \frac{2 q_2}{\mu}.$$ 

By the definition of a final class, $p^{(E)}(\ell = 1, x = 1 \mid w = 2) = 1$, which implies that $p_2 = q_2 = 1$, and ends 
the proof. 

6 Conclusion and perspectives 


We rigorously studied the contagion [U [T5] on a clustered random graph model with overlapping 
communities. Our random graph model allows an arbitrary distribution for the community sizes, while the 
heuristic study of Hackett et al. [13] was done on a random graph model with communities of size three. 
To our knowledge, these are the only two studies of this epidemic model on random graphs with 
overlapping communities. There are several dependencies that made this study challenging: (i) in the 
epidemic model itself, since the behavior of an individual depends on the behavior of all her neighbors; 
(ii) in the random graph model considered, which allows an arbitrary distribution for both the community 
size and the number of communities an individual belongs to. In addition, our study provides heuristics 
for the contagion on the one-mode projection of a random bipartite graph with arbitrary degree 
distributions, which is well suited to modeling real-world networks [22, 12]. 

We showed that our epidemic is completely described by a multi-type and alternating branching 
process, and used a non-classical theorem on phase transitions for multi-type branching processes, referred 
to as Sevastyanov's theorem, to prove a phase transition for our process. This opens the way to the study 
of the clustering effect on the cascade phenomenon in this case, as done in [13] or [5]. 